A Trace Cache Microarchitecture and Evaluation

Authors

Eric Rotenberg

Steve Bennett

James E. Smith

Abstract

As the instruction issue width of superscalar processors increases, instruction fetch bandwidth requirements will also increase. It will eventually become necessary to fetch multiple basic blocks per clock cycle. Conventional instruction caches hinder this effort because long instruction sequences are not always in contiguous cache locations. Trace caches overcome this limitation by caching traces of the dynamic instruction stream, so instructions that are otherwise noncontiguous appear contiguous. In this paper, we present and evaluate a microarchitecture incorporating a trace cache. The microarchitecture provides high instruction fetch bandwidth with low latency by explicitly sequencing through the program at the higher level of traces, both in terms of (1) control flow prediction and (2) instruction supply. For the SPEC95 integer benchmarks, trace-level sequencing improves performance by 15% to 35% over an otherwise equally sophisticated, but contiguous, multiple-block fetch mechanism. Most of this performance improvement is due to the trace cache. However, for one benchmark whose performance is limited by branch mispredictions, the performance gain is due almost entirely to improved prediction accuracy.
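The core idea in the abstract, storing dynamically consecutive instructions together so that they can be fetched as one unit, can be illustrated with a small software model. The following is a minimal sketch and not the microarchitecture evaluated in the paper: the names `Instr`, `build_traces`, and `TraceCache`, the limits `MAX_INSTRS` and `MAX_BRANCHES`, and the choice to index a trace by its starting address plus its branch outcomes are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Assumed limits for this sketch; not taken from the abstract.
MAX_INSTRS = 16
MAX_BRANCHES = 3


@dataclass(frozen=True)
class Instr:
    """One dynamically executed instruction (hypothetical model)."""
    pc: int
    is_branch: bool = False
    taken: bool = False


def build_traces(stream, max_instrs=MAX_INSTRS, max_branches=MAX_BRANCHES):
    """Cut a dynamic instruction stream into traces.

    A trace ends when it reaches max_instrs instructions or
    max_branches branches, whichever comes first.
    """
    traces, current, branches = [], [], 0
    for ins in stream:
        current.append(ins)
        if ins.is_branch:
            branches += 1
        if len(current) == max_instrs or branches == max_branches:
            traces.append(current)
            current, branches = [], 0
    if current:
        traces.append(current)
    return traces


class TraceCache:
    """Maps (start PC, branch outcomes) to the PCs of one trace."""

    def __init__(self):
        self.lines = {}

    @staticmethod
    def key(trace):
        outcomes = tuple(i.taken for i in trace if i.is_branch)
        return (trace[0].pc, outcomes)

    def fill(self, trace):
        self.lines[self.key(trace)] = [i.pc for i in trace]

    def fetch(self, start_pc, predicted_outcomes):
        """Return the whole trace on a hit, or None on a miss."""
        return self.lines.get((start_pc, tuple(predicted_outcomes)))


# A taken branch at PC 8 jumps to PC 100, so the instructions are
# noncontiguous in memory but consecutive in the dynamic stream.
stream = [
    Instr(0), Instr(4), Instr(8, True, True),
    Instr(100), Instr(104, True, False), Instr(108),
]
cache = TraceCache()
for t in build_traces(stream):
    cache.fill(t)

hit = cache.fetch(0, (True, False))    # [0, 4, 8, 100, 104, 108]
miss = cache.fetch(0, (False, False))  # None: different predicted path
```

In this model, a single lookup returns instructions from both sides of the taken branch, which a conventional cache organized by address could not supply from one contiguous line. The lookup misses when the predicted branch outcomes differ from those of the stored trace, which is why trace-level control flow prediction matters alongside instruction supply.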

Similar resources

Trace Cache Performance

Instruction fetch mechanism is a performance bottleneck of a Superscalar Processor. Fetch performance can be improved with the aid of an instruction memory known as a Trace Cache. This paper presents analytical expressions, which describe instruction fetch performance of a Trace Cache microarchitecture. The instruction fetch rates predicted by the expressions differ by seven percent from the si...

Turboscalar: A High Frequency High IPC Microarchitecture

There is significant performance motivation to build larger and wider superscalar machines; however, the implementation complexity can be overwhelming. When superscalar machines grow, they necessarily become deeper in order to maintain frequency. As the pipeline depth increases, the performance gained by a wide instruction fetch and dispatch is lost to branch misprediction penalty cycles. This wor...

Summarizing multiprocessor program execution with versatile, microarchitecture-independent snapshots

Computer architects rely heavily on software simulation to evaluate, refine, and validate new designs before they are implemented. However, simulation time continues to increase as computers become more complex and multicore designs become more common. This thesis investigates software structures and algorithms for quickly simulating modern cache-coherent multiprocessors by amortizing the time ...

A survey of new research directions in microprocessors

Current microprocessors utilise the instruction-level parallelism by a deep processor pipeline and the superscalar instruction issue technique. VLSI technology offers several solutions for aggressive exploitation of the instruction-level parallelism in future generations of microprocessors. Technological advances will replace the gate delay by on-chip wire delay as the main obstacle to increase...

Microarchitecture Characteristics and Implications of Alignment of Multiple Bioinformatics Sequences

With the growth of bioinformatics and computational biology industry, multiple sequence alignment (MSA) applications have become an important emerging workload. In spite of the large amount of recent attention given to the MSA software design, there has been little quantitative understanding of the performance of such applications on modern microprocessors and systems. In this paper, we analyze...

Journal:

IEEE Trans. Computers

Volume 48

Publication date: 1999
